In the following algorithm, Step 1 calculates the execution time and processor
utilization for each task. Steps 2.1 through 2.3 calculate the latency, the timing
strength, and the frequency strength, which Step 2.4 combines into the control coupling.
Step 2.5 then stores this value in the coupling matrix.

1. For each T_i do:

   1.1 Choose P_x from {default or each available type}

   1.2 Calculate the execution time T_i[ET] based on the chosen P_x (single or multiple)

   1.3 Calculate the processor utilization T_i[PU] based on the chosen P_x (single or multiple)

2. For each T_i, loop through all tasks {T_n | T_i → T_n}:

   2.1 Calculate the latency for each task coupling, TT_{i→n}[LT] (assumes pipelined structure and uniform bus), as:

       TT_{i→n}[LT] = communication constant

   2.2 Rank the coupling latency, TT_{i→n}[LT], against the Timing Step Function to calculate the timing strength for this coupling, TT_{i→n}[TS]

   2.3 Rank the coupling frequency, TT_{i→n}[FQ], against the Frequency Step Function to calculate the frequency strength for this coupling, TT_{i→n}[FS]

   2.4 Calculate the control coupling as:

       TT_{i→n}[CC] = max{TT_{i→n}[TS], TT_{i→n}[FS]} + {1, if both TS and FS > 5 | 0, otherwise}

   2.5 Add the constraint TT_{i→n}[CC] to the coupling matrix

End
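
The control coupling steps above can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the Timing and Frequency Step Functions are not given in this section, so the threshold lists, their units, and the direction of the ranking (shorter latency and higher frequency give stronger couplings) are assumptions, as are all function names.

```python
from bisect import bisect_right

# Hypothetical step-function thresholds (assumed values and units).
# The actual Timing and Frequency Step Functions are defined elsewhere.
TIMING_STEPS = [1, 2, 5, 10, 20, 50, 100, 200, 500]     # latency bounds, e.g. ms
FREQUENCY_STEPS = [1, 2, 5, 10, 20, 50, 100, 200, 500]  # rates, e.g. Hz


def timing_strength(latency):
    """Step 2.2: rank latency on a 1-10 scale (assumed: shorter is stronger)."""
    return 10 - bisect_right(TIMING_STEPS, latency)


def frequency_strength(frequency):
    """Step 2.3: rank frequency on a 1-10 scale (assumed: higher is stronger)."""
    return 1 + bisect_right(FREQUENCY_STEPS, frequency)


def control_coupling(ts, fs):
    """Step 2.4: max(TS, FS), plus 1 when both strengths exceed 5."""
    bonus = 1 if ts > 5 and fs > 5 else 0
    return max(ts, fs) + bonus


def build_control_couplings(flows):
    """Step 2.5: map each (i, n) task pair to its control coupling.

    `flows` is a hypothetical input: {(i, n): (latency, frequency)}.
    """
    matrix = {}
    for pair, (latency, frequency) in flows.items():
        ts = timing_strength(latency)
        fs = frequency_strength(frequency)
        matrix[pair] = control_coupling(ts, fs)
    return matrix
```

With these assumed thresholds, a coupling with latency 3 and frequency 60 ranks as TS = 8 and FS = 7, so its control coupling is 9. The sketch applies the formula exactly as written, so two strengths of 10 would produce 11; no cap is added here.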

6.3.3.2 Data Couplings 

The system's data flow contains both task input and output. These are represented
individually to support analysis of the data couplings, and the subsequent partitioning
and allocation decisions, before determining which data blocks are best implemented as
messages, shared memory, or databases.
Figure 3 Input vs. Output Data Block Access



2.1.1 Within this sorted list, for each set of triples where
max[processor.ratio, memory.ratio] is equal,

2.1.1.1 sort this subset of triples by
min[processor.ratio, memory.ratio].



End 
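
The tie-breaking rule in the steps above amounts to a two-level sort key. The following Python sketch shows one way to express it. The `Triple` record, its field names, and the descending sort order are assumptions made for illustration; this excerpt does not state the sort direction or the full contents of a triple.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """Hypothetical record: a label plus the two ratios named in the text."""
    label: str
    processor_ratio: float
    memory_ratio: float


def sort_triples(triples, descending=True):
    """Sort by max(ratio); break ties on min(ratio).

    The direction is an assumption; pass descending=False to reverse it.
    """
    def key(t):
        larger = max(t.processor_ratio, t.memory_ratio)
        smaller = min(t.processor_ratio, t.memory_ratio)
        return (larger, smaller)

    return sorted(triples, key=key, reverse=descending)
```

For example, triples A (0.8, 0.3) and B (0.8, 0.5) share the same maximum ratio, so B is placed ahead of A in descending order because its minimum ratio is larger.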



6.3.3 Calculate Couplings 

The following formulas calculate the control, data, and peripheral couplings from the 
domain model. 



The structure of the communication (inter-task and task-data) is assumed to be a pipelined
message, which is used to calculate the latency in the couplings. This basic structure
represents both control and data flow: data flow carries a volume of data, while control
flow consumes minimal bandwidth. The algorithm can be modified to include the other
remote procedure call and messaging structures, at any level of nesting.

Figure 3 Pipelined vs. Nested Communication

For example, the impact on data flow latency of different communication structures, such 
as rendezvous vs. direct schedule, adds the following calculations: 

1. CASE IPC is

   1.1 Rendezvous Output

       TD_{i→j}[LT] = T_i[RST] - D_j[RR]

   1.2 Rendezvous Input

       DT_{j→i}[LT] = T_i[RST] - D_j[AT]

   1.3 Direct Schedule Output

       TD_{i→j}[LT] = communication constant

   1.4 Direct Schedule Input

       DT_{j→i}[LT] = communication constant

End Case
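
The case statement above can be sketched in Python as a single dispatch function. This is an illustrative sketch only: the enum, the function name, the parameter names, and the default value of the communication constant are assumptions. The attributes RST, RR, and AT are taken from the formulas as written and are passed in as plain numbers.

```python
from enum import Enum


class IPC(Enum):
    RENDEZVOUS_OUTPUT = "rendezvous output"
    RENDEZVOUS_INPUT = "rendezvous input"
    DIRECT_SCHEDULE_OUTPUT = "direct schedule output"
    DIRECT_SCHEDULE_INPUT = "direct schedule input"


def data_latency(ipc, task_rst=None, data_rr=None, data_at=None,
                 comm_constant=1.0):
    """Return the data coupling latency [LT] for one IPC structure.

    task_rst is T_i[RST]; data_rr is D_j[RR]; data_at is D_j[AT].
    comm_constant is an assumed placeholder value.
    """
    if ipc is IPC.RENDEZVOUS_OUTPUT:
        return task_rst - data_rr      # case 1.1
    if ipc is IPC.RENDEZVOUS_INPUT:
        return task_rst - data_at      # case 1.2
    # Cases 1.3 and 1.4: direct schedule uses the communication constant.
    return comm_constant
```

A call such as `data_latency(IPC.RENDEZVOUS_OUTPUT, task_rst=10.0, data_rr=4.0)` returns 6.0, while either direct schedule case returns the communication constant regardless of the task and data attributes.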



Section 6.3.3.3 Peripheral Couplings discusses the peripheral I/O specifics.
6.3.3.1 Control Couplings 
Figure: Initial Hardware Suite



7.5 Couplings 

Table 13 shows the triples for this example, implemented as a set of three columns
(control, data, and peripheral), and ten rows (values 1-10) representing the range of
coupling strengths.
The results for this example system's initial estimates are shown in Figure 9.




7.8 Allocation Results 

The resulting allocation for the example's original estimates is shown in Figure 10. 
Figure 10 Example Allocation
The resulting worst case loads are shown below in Table 19. 
Table 19 Initial Platform Loads 



Platform        Processor   Processor     Memory     Memory
(Component)     Capacity    Utilization   Capacity   Utilization

P1 (C4)         15 MIPS     68%           300 MB     84%
P2 (C3 & C5)    60 MIPS     93%           200 MB     88%
P3 (C1)         70 MIPS     89%           1 GB       90%
P4 (C2)         50 MIPS     90%           1 GB       80%



Several architectural tradeoffs must again be made during allocation. For example,
Components C1 and C2 fill up Processors P3 and P4, respectively. Had these two processors
exchanged locations on the ring bus with P1 and P2, there would have been a conflict
between allocating C1 and C2 to processors with sufficient capacity and the extra
network traffic caused by the additional hops from the input sensors. Also, C1 requires
more CPU capacity than C2, due to the inclusion of Task B1.

Platform Loads

Table 20 shows the resultant loads caused by modifying the platform capacities to
combine the original first and second platforms (P1 and P2 are now Px). As shown, Px
leaves almost 30% of its memory unused under the worst-case load, and the CPU loads
also show significant excess capacity. These worst-case values are calculated to support the
required response time, so this may indicate an opportunity to save on hardware costs.
Table 20 Modified Platform Loads 



Platform           Processor   Processor     Memory     Memory
(Component)        Capacity    Utilization   Capacity   Utilization

Px (C3, C4 & C5)   80 MIPS     82%           600 MB     71%
P3 (C1)            70 MIPS     86%           1 GB       80%
P4 (C2)            50 MIPS     90%           1 GB       90%
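
The excess capacity of Px in Table 20 can be checked with a short calculation. The Python sketch below is illustrative only; the function name and the returned field names are assumptions, and the capacity and utilization figures are the Px values from the table.

```python
def headroom(capacity, utilization_pct):
    """Split a capacity into used and unused parts for a worst-case load.

    utilization_pct is a whole-number percentage, e.g. 71 for 71%.
    """
    used = capacity * utilization_pct / 100
    return {
        "used": used,
        "unused": capacity - used,
        "unused_pct": 100 - utilization_pct,
    }


# Px values from Table 20: 600 MB at 71%, 80 MIPS at 82%.
px_memory = headroom(600, 71)
px_cpu = headroom(80, 82)
```

For Px this gives 426 MB used and 174 MB unused (29% of the memory), and 65.6 MIPS used with 14.4 MIPS (18%) of processor capacity to spare.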






Figure 11 describes one change that might result from understanding the peripheral
bottleneck described in section 7.5 Couplings. The peripherals are relocated from a
single place on a ring bus to direct processor connections. In this analysis, the initial
configuration was used to decide that the peripherals needed to be moved. After the
model was changed to reflect the modified hardware suite, the process was re-evaluated and
generated the configuration shown. This confirmed that the change in the hardware suite
was appropriate, and it illustrates the recursive impact of the partitioning and
allocation tradeoffs.

Figure 11 Modified Hardware Suite and Resulting Allocation



